Designing the Global Data Warehouse with SPJ Views
Authors
Abstract
A global data warehouse (DW) integrates data from multiple distributed heterogeneous databases and other information sources. A global DW can be abstractly seen as a set of materialized views. The selection of views for materialization is an important decision in the implementation of a DW. Current commercial products do not provide tools for automatic DW design. In this paper we provide a generic method that, given a set of SPJ queries to be satisfied by the DW, generates all the ‘significant’ sets of materialized views that satisfy all the input queries. This process is complex since ‘common subexpressions’ between the queries need to be detected and exploited. Our method is then applied to the problem of selecting a materialized view set that fits in the space allocated to the DW for materialization and minimizes the combined overall query evaluation and view maintenance cost. We design and implement algorithms and report on their experimental evaluation.
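The selection step described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the candidate view sets have already been generated, that each one comes with a known space requirement and known cost estimates, and that the two costs are combined as a weighted sum. The names `ViewSet`, `select_view_set`, and `maintenance_weight`, and all numbers, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewSet:
    """A candidate set of materialized views (hypothetical structure)."""
    name: str
    space: float             # storage needed to materialize the set
    query_cost: float        # estimated cost of answering all input queries
    maintenance_cost: float  # estimated cost of keeping the views up to date


def select_view_set(candidates, space_budget, maintenance_weight=1.0):
    """Return the cheapest candidate that fits the space budget, or None.

    The combined cost is assumed to be
    query_cost + maintenance_weight * maintenance_cost.
    """
    feasible = [c for c in candidates if c.space <= space_budget]
    if not feasible:
        return None
    return min(
        feasible,
        key=lambda c: c.query_cost + maintenance_weight * c.maintenance_cost,
    )


# Illustrative numbers only; they are not taken from the paper.
candidates = [
    ViewSet("base relations only", space=100, query_cost=90, maintenance_cost=5),
    ViewSet("all queries materialized", space=400, query_cost=10, maintenance_cost=60),
    ViewSet("shared subexpressions", space=250, query_cost=25, maintenance_cost=20),
]

best = select_view_set(candidates, space_budget=300)
print(best.name)
```

With a budget of 300, the fully materialized set does not fit, and the set built on shared subexpressions has the lowest combined cost among the remaining candidates. The hard part the abstract refers to, generating the candidate sets by detecting common subexpressions, is deliberately left out of this sketch.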
Similar resources
Making Multiple Views Self-Maintainable in a Data Warehouse
A data warehouse collects and maintains a large amount of data from several distributed and heterogeneous data sources. Often the data is stored in the form of materialized views in order to provide fast access to the integrated data, regardless of the availability of the data sources. In this paper we focus on the following problem: for a given set of materialized select-project-join (SPJ) vie...
Heuristic Algorithms for Designing a Data Warehouse with SPJ Views
A Data Warehouse (DW) can be abstractly seen as a set of materialized views defined over relations that are stored in distributed heterogeneous databases. The selection of views for materialization in a DW is thus an important decision problem. The objective is the minimization of the combination of the query evaluation and view maintenance costs. In this paper we expand on our previous work by ...
Parallel Maintenance of Materialized Views on Personal Computer Clusters
A data warehouse is a repository of integrated information that collects and maintains a large amount of data from multiple distributed, autonomous and possibly heterogeneous data sources. Often the data is stored in the form of materialized views in order to provide fast access to the integrated data. How to maintain the warehouse data completely consistent with the remote source data is a cha...
Evolving Materialized Views in Data Warehouse - Evolutionary Computation, 1999. CEC 99. Proceedings of the 1999 Congress on
A data warehouse contains multiple views accessed by queries. One of the most important decisions in designing a data warehouse is the selection of materialized views for the purpose of efficiently implementing decision making. The search space for the selection of materialized views is exponentially large, therefore, heuristics have been used to search a small fraction of the space to get a ne...
Evolving Materialized Views in Data Warehouse
A data warehouse contains multiple views accessed by queries. One of the most important decisions in designing a data warehouse is the selection of materialized views for the purpose of efficiently implementing decision making. The search space for the selection of materialized views is exponentially large, therefore, heuristics have been used to search a small fraction of the space to get a near...